Activation Function Analysis

An interactive visualization of neural network activation functions and their performance characteristics

Experiment Controls

Training Parameters

Learning rate: 0.0010
Network depth: 2 layers

Dataset Configuration

Activation Functions

ReLU

Rectified Linear Unit

f(x)=max(0,x)

Avoids vanishing gradients for positive inputs. Computationally efficient, but can suffer from "dying neurons" that output zero and stop updating.

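For reference, here is a minimal NumPy sketch of ReLU and its derivative (the helper names and the NumPy dependency are illustrative, not part of this page's code); the zero gradient on the negative side is exactly what lets neurons die:

    import numpy as np

    def relu(x):
        # f(x) = max(0, x), applied element-wise
        return np.maximum(0.0, x)

    def relu_grad(x):
        # Derivative: 1 for x > 0, 0 otherwise. A unit whose pre-activation
        # stays negative receives zero gradient and stops learning -- the
        # "dying neuron" effect mentioned above.
        return (x > 0).astype(float)

    x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
    print(relu(x))       # negative inputs clipped to 0, positive passed through
    print(relu_grad(x))  # gradient is 0 on the clipped side, 1 elsewhere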

Sigmoid

Logistic Function

f(x)=1/(1+e⁻ˣ)

Outputs between 0 and 1. Useful for probability outputs but suffers from vanishing gradients.

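To see the vanishing-gradient claim numerically, here is an illustrative NumPy sketch (not this page's code): the sigmoid derivative never exceeds 0.25, and backpropagation multiplies one such factor per layer:

    import numpy as np

    def sigmoid(x):
        # f(x) = 1 / (1 + e^(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        # f'(x) = f(x) * (1 - f(x)), which peaks at 0.25 at x = 0
        s = sigmoid(x)
        return s * (1.0 - s)

    x = np.linspace(-6.0, 6.0, 121)
    print(sigmoid_grad(x).max())  # 0.25: each layer scales the gradient by <= 1/4
    print(0.25 ** 5)              # ~0.001: what a 5-layer chain does at best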

Tanh

Hyperbolic Tangent

f(x)=tanh(x)

Zero-centered output between -1 and 1. Better than sigmoid for hidden layers but still has gradient issues.

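Both properties are easy to check with a short NumPy sketch (illustrative only): the output is zero-centered over a symmetric input range, yet the derivative still collapses once inputs saturate:

    import numpy as np

    def tanh_grad(x):
        # f'(x) = 1 - tanh(x)^2: equal to 1 at x = 0, near 0 once |x| > 2
        return 1.0 - np.tanh(x) ** 2

    x = np.linspace(-4.0, 4.0, 81)
    print(np.tanh(x).mean())   # ~0 over a symmetric range: zero-centered output
    print(tanh_grad(x).max())  # 1.0 at the origin, but saturation still shrinks it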

Linear

Identity Function

f(x)=x

No transformation applied. Used as a baseline comparison and for output layers in regression.

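Why the identity is only a baseline can be checked directly; in the sketch below (NumPy, with made-up weights), stacking identity-activated layers collapses into a single linear map:

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(8, 8))
    W2 = rng.normal(size=(8, 8))
    x = rng.normal(size=8)

    # With identity activations, two layers compose into one linear map:
    # W2 @ (W1 @ x) equals (W2 @ W1) @ x, so extra depth adds no expressive power.
    print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))  # True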

Function Performance

Per-function readout (updated each epoch): current epoch, MSE, and average gradient for Linear, ReLU, Sigmoid, and Tanh.

Gradient Analysis

Average gradient magnitude in the first hidden layer

Chart series: Linear, ReLU, Sigmoid, Tanh
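
The plotted quantity can be reproduced offline. The sketch below builds a hypothetical small MLP in NumPy (the layer sizes, sine target, and initialization are all assumptions, not this page's network) and measures the average gradient magnitude reaching the first hidden layer at initialization, for each activation:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # (activation, derivative) pairs for the four functions in the chart
    ACTS = {
        "Linear":  (lambda z: z,                lambda z: np.ones_like(z)),
        "ReLU":    (lambda z: np.maximum(0, z), lambda z: (z > 0).astype(float)),
        "Sigmoid": (sigmoid,                    lambda z: sigmoid(z) * (1 - sigmoid(z))),
        "Tanh":    (np.tanh,                    lambda z: 1 - np.tanh(z) ** 2),
    }

    def first_layer_grad(act, dact, depth=2, width=16, n=256, seed=0):
        # Average |dL/dz| at the first hidden layer for one batch of an
        # MSE regression task, measured at initialization (epoch 0).
        rng = np.random.default_rng(seed)
        X = rng.normal(size=(n, 1))
        y = np.sin(3 * X)                                # assumed toy target
        sizes = [1] + [width] * depth + [1]
        Ws = [rng.normal(scale=1 / np.sqrt(m), size=(m, k))
              for m, k in zip(sizes[:-1], sizes[1:])]

        # Forward pass, keeping each hidden layer's pre-activation z
        a, zs = X, []
        for W in Ws[:-1]:
            z = a @ W
            zs.append(z)
            a = act(z)
        pred = a @ Ws[-1]

        # Backward pass: carry dL/d(activation) from the output back down
        grad_a = (2.0 / n) * (pred - y) @ Ws[-1].T
        for W, z in zip(reversed(Ws[1:-1]), reversed(zs[1:])):
            grad_a = (grad_a * dact(z)) @ W.T
        return np.abs(grad_a * dact(zs[0])).mean()       # dL/dz at layer 1

    for name, (act, dact) in ACTS.items():
        print(f"{name:8s} avg first-layer gradient: {first_layer_grad(act, dact):.6f}")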

Technical Details

This visualization demonstrates how different activation functions perform when training neural networks on various datasets. The interface allows you to:

  • Compare ReLU, Sigmoid, Tanh, and Linear activation functions
  • Adjust network depth and learning rate parameters
  • Visualize gradient flow and vanishing gradient problems
  • Test performance on different dataset complexities

Observe how ReLU maintains strong gradients while Sigmoid and Tanh suffer from vanishing gradients, especially in deeper networks.
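
A rough way to see the depth effect (an illustrative estimate, not this page's computation): sample standard-normal pre-activations, take each activation's mean derivative as a typical per-layer scaling factor, and raise it to the network depth:

    import numpy as np

    rng = np.random.default_rng(0)
    z = rng.normal(size=100_000)   # assumed: pre-activations roughly standard normal

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Mean |f'(z)|: a typical factor the gradient is scaled by at each layer
    factors = {
        "Linear":  1.0,
        "ReLU":    float(np.mean(z > 0)),                          # ~0.5 (on/off mask)
        "Sigmoid": float(np.mean(sigmoid(z) * (1 - sigmoid(z)))),  # ~0.21
        "Tanh":    float(np.mean(1 - np.tanh(z) ** 2)),            # ~0.60
    }

    for depth in (2, 4, 8, 16):
        row = ", ".join(f"{name}: {f ** depth:.2e}" for name, f in factors.items())
        print(f"depth {depth:2d} -> {row}")

    # ReLU's 0.5 comes from the on/off mask: units that are "on" pass gradients
    # at full strength (derivative exactly 1), and He-style weight initialization
    # is commonly chosen to offset the mask. Sigmoid and Tanh shrink every path,
    # and scaling the weights up only pushes them further into saturation.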
